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Huang's Empirical Mode Decomposition (EMD) is an algorithm for analyzing nonsta- 
tionary data that provides a localized time-frequency representation by decomposing the 
data into adaptively defined modes. EMD can be used to estimate a signal's instanta- 
neous frequency (IF) but suffers from poor performance in the presence of noise. To 
produce a meaningful IF, each mode of the decomposition must be nearly monochro- 
matic, a condition that is not guaranteed by the algorithm and fails to be met when the 
signal is corrupted by noise. In this work, the extraction of modes containing both signal 
and noise is identified as the cause of poor IF estimation. The specific mechanism by 
which such "transition" modes are extracted is detailed and builds on the observation 
of Flandrin and Goncalvcs that EMD acts in a filter bank manner when analyzing pure 
noise. The mechanism is shown to be dependent on spectral leak between modes and 
the phase of the underlying signal. These ideas are developed through the use of simple 
signals and are tested on a synthetic seismic waveform. 

Keywords: Empirical Mode Decomposition; intrinsic mode functions; instantaneous fre- 
quency; noise. 

1. Introduction 

Empirical Mode Decomposition (EMD) is an analysis tool for nonstationary data 
introduced by |Huang et al.\ m 1998. Nonstationary signals have statistical properties 
that vary as a function of time and should be analyzed differently than stationary 
data. Rather than assuming that a signal is a linear combination of predetermined 
basis functions, the data are instead thought of as a superposition of fast oscilla- 
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tions onto slow oscillations Flandrin and Goncalves (2004)] . EMD identifies those 



oscillations that arc intrinsically present in the signal and produces a decomposition 
using these modes as the expansion basis. We note that throughout this work the 
term "basis" is used in the same sense as used by Huang |1998j : the modes of a 
signal's decomposition do not span a particular space, but provide an expansion for 
the specific signal. In this way, the basis is data driven and adaptively defined each 
time a decomposition is performed Flandrin and Goncalves (2004)| . EMD has been 



used for data analysis in a variety of applications including engineering, biomedical, 
financial and geophysical sciences [Huang and Shen (2005)] . 



In contrast with Fourier analysis, EMD requires no assumptions on its input 
and is therefore well suited to analyze nonstationary data. Since nonstationarity 
implies that a signal is not well represented by pure tones, a significant number 
of harmonics is required to represent a nonstationary signal in the Fourier basis. 
Energy must be spread across many modes to accommodate deviations from a pure 
tone. In producing an adaptive decomposition consisting of modes that allow for 
such deviations, EMD efficiently represents the signal by relaxing the need to ex- 
plore all frequencies. A signal is expanded using only a small number of adaptively 
defined modes. 

As EMD is an algorithm and not yet a theoretical tool, its limits must 
be tested experimentally. Several authors have reported on its performance in 
the presence of noise Huang and Shen (2005)| Flandrin and Goncalves (2004)| 



Kijewski-Correa and Kareem (2007) . In this work, we propose a new understand- 
ing of the mechanism that prevents the algorithm from properly estimating the 
instantaneous frequency of a noisy signal. The paper is organized as follows. Sec- 
tion 2 gives a brief description of the EMD algorithm and demonstrates its use 
in estimating the instantaneous frequency of a clean signal. The same estimation, 
performed in the presence of noise, is seen to be problematic in Section 3 and the 
cause is identified. Section 4 outlines a new explanation for this poor performance. 
Finally synthetic seismic data are used in Section 5 to extend our study from simple 
signals to a model for real world data. 



2. Empirical Mode Decomposition 
2.1. Algorithm 

The goal of Empirical Mode Decomposition is to represent a signal as an expan- 
sion of adaptively defined basis functions with well defined frequency localization. 
Each basis function, called an Intrinsic Mode Function (IMF), should be physi- 
cally meaningful, representing ideally one frequency (nearly monochromatic). To 
accomplish this, an IMF is defined as a function for which (1) at any point, 
the mean of the envelopes defined by local maxima and minima is zero, and (2) 
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the number of extrema and the number of zero crossings differ by at most one 
Huang et al. (1998)] . Such a definition attempts to ensure that a meaningful in- 



stantaneous frequency can be obtained from each IMF, a process that is defined 
and detailed in the next subsection, but does not guarantee that each IMF is nar- 



row band Huang et al. (1998) . To decompose a signal x(t), the EMD algorithm 



works as follows Flandrin and Goncalves (2004)] : 



(1) Interpolate (usually with cubic splines) the local maxima of x(t) to form an 
envelope. Repeat for the minima. 

(2) Compute the mean, m(t), of the two envelopes. 

(3) Compute the detail, d(t), by subtracting the mean from the signal, d(t) = 
x(t) — m(t). Extract the detail as an IMF. 

(4) Repeat the iteration on the residual m(t). Continue until the residual is such 
that no IMF can be extracted and represents the trend. 

While the trend does not meet the definition of an IMF, we will refer to it as the 
final IMF for convenience. Before the detail, d(t), can be considered an IMF, a 
"sifting" process takes place during which the detail is treated as a new signal and 
is iterated until a predefined stopping criterion is reached. The purpose of this step 
is to enforce the definition of an IMF Flandrin and Goncalves (2004)] . Ideally, all 



modes arc now nearly monochromatic and can be used to give a meaningful esti- 
mate of the signal's instantaneous frequency. 

The algorithm can be described in the time-frequency domain as a collection 
of data-dependent projections. IQlhede an d Walden [2004 formalize this idea by 
defining projection operators Pr . , not necessarily orthogonal, that project a signal 
x(t) into regions Rj of the time- frequency plane. The signal may then be written 
as 

K 

*(t) = X)[Pfl,*](t), 

where K is the number of IMFs produced, with the Kth IMF being the trend. Since 
each projection gives rise to an IMF, an expansion of the signal is then given by 

K 

x(t)=Y,X j (t), 
i=i 

where X, is the jth IMF. 



2.2. Estimation of instantaneous frequency 

A signal is often characterized in terms of its frequency content. When a signal's 
statistical properties are shift-invariant in time, it is said to be stationary. As this 
definition implies, frequency remains constant throughout the signal's duration, and 
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is easily defined as the number of periods per unit time. However, if the signal's 
frequency varies with time, it is said to be nonstationary, and this global definition 
of frequency loses meaning. It is therefore necessary to characterize the frequency 
content of the signal in a local manner. For example, a chirp with a quadratic phase 
has frequency that changes linearly from one instant to the next. It is not possi- 
ble to pinpoint one frequency for the entire chirp. Instead the chirp's frequency is 
described as a (linear) function of time. It is therefore more useful to characterize 
such a signal in terms of its instantaneous frequency. 

Boashash |1992| describes instantaneous frequency (IF) as "a time-varying pa- 
rameter which defines the location of the signal's spectral peak as it varies with 
time." He points to seismic, radar, sonar, communication, and biomedical applica- 
tions as fields where IF is utilized. Two conditions are needed to produce a physically 
meaningful and well defined instantaneous frequency. The signal must be analytic 
and it must be narrow band. An analytic signal is produced via the Hilbert trans- 
form: 



[Hx}( t )=ipv r ^idf, 



where PV denotes the Cauchy principle value. Given a real valued signal, x(t), its 
analytic representation is then defined as z(t) = x(t) + i[Hx](t). The analytic signal 
z(t) may be written in the form 

z(t) =a(i)e^ (t) , 

and the instantaneous frequency, v(t), can then be defined Boashash (1992)| in 
terms of the derivative of the phase (f>(t) : 

The derivative must be well defined since physically there can only be one instan- 
taneous frequency value v(t) at a given time t. This is ensured by the narrow band 
condition: the signal must contain nearly one frequency. Further, as detailed by 
Boashash 1992 , the Hilbert transform produces a more physically meaningful re- 
sult the closer its input signal is to being narrow band. However, we wish to work 
with signals that are much more interesting than those that are monochromatic. 
This can be achieved by decomposing such a signal into several nearly monochro- 
matic components, each of which provides a well defined, meaningful instantaneous 
frequency. An overall IF estimate of a signal x, given its decomposition into K 
IMFs, is then calculated as a weighted sum of the individual IFs: 



IF(x(t)) 



E*i^(*)*j(*) 



where Aj (t) and Vj (t) are, respectively, the magnitude and instantaneous frequency 
of the analytic representation of IMF Xj Olhede and Walden (2004)] . 
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To demonstrate the calculation of IF, consider x(t) = sin(200i 2 ) + sin(20i), the 
superposition of a linear chirp onto a stationary sine wave, on the interval t 6 [0, 1] 
seconds. Figure Q}i shows the true analytic^ IF (in red) and the overall IF estimate 
(in blue) obtained from the IMFs (shown in figure [1})) of x(t). We are able to calcu- 
late a physically meaningful instantaneous frequency when using the decomposition 
of a signal in the absence of noise. 



Huang et al. [2009] give a detailed discussion on the shortcomings of this method 
of IF calculation. In particular, they note that the analytic signal obtained from the 
Hilbert transform is only physically meaningful if the conditions of the Bedrosian 
theorem are met. They introduce a normalization scheme that empirically sepa- 
rates the AM and FM components of each IMF, where the AM carries the envelope 
and the FM is the constant amplitude variation in frequency. The "normalized" 
FM component of an IMF is guaranteed to satisfy the Bedrosian theorem and is 
therefore suitable for the Hilbert transform. Alternatively, once an IMF has been 
normalized, | Huang et al.\ |2009j propose eschewing any Hilbert transform in favor 
of applying a 90 degree phase shift by means of a direct quadrature. Both methods 
are demonstrated to be more accurate on clean signals than the standard method 
presented above. Since the focus of our work is the performance of EMD in the 
presence of noise, the performance of this normalization scheme on noisy data will 
be addressed in the next section. 



a The analytic IF of the superposition of two signals, x(t) = Ai(t)e I< ^i( t ) + A2{t)e l 't' 2 ^ t \ is 
defined as the average of the individual IFs of each signal only when |Ai(t)| = |A2(t)| 
[Loughlin and Tacer (1997)] . We note that this condition holds for this example, and we compute 
the analytic IF accordingly. An example for which the condition does not hold will be encountered 
in section 5. 
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Fig. 1. The instantaneous frequency estimate and IMFs of a clean signal. 



3. Performance in the Presence of Noise 

A clean signal can produce a decomposition that lends itself to a meaningful in- 
stantaneous frequency estimate. However, as is the case in many applications, data 
are often contaminated by noise. Decomposing a noisy signal produces both narrow 
and wide band IMFs. While most of the wide band IMFs contain noise and may be 
discarded, a small number capture the transition from noise to signal and must be 
kept. This leads to a corrupted estimate of the instantaneous frequency. 



3.1. Evidence of a problem 

In the previous section the calculation of instantaneous frequency was described. 
This process is now applied to the same signal contaminated with additive white 
Gaussian noise such that its SNR is 27dB. Throughout this work we use SNR 
= 10 log 10 (^^-)dB, where a is the standard deviation of the noise. The result is 
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shown in figure [5] and it is clear that a meaningful instantaneous frequency estimate 
was not produced. 



Instantaneous Frequency 
500 I 1 1 1 




seconds 



Fig. 2. The corrupted instantaneous frequency estimate of a noisy signal. 

To understand this poor result, recall that a signal's IF is computed as a weighted 
sum of the IF from each of its IMFs. The analytic representation of each IMF is 
required and thus each IMF must be narrow band to ensure a meaningful Hilbert 
transform. Moreover, IF is well defined only in the case of a nearly monochromatic 
signal. Therefore, for the purpose of computing a meaningful IF, the key feature of 
the decomposition is that each IMF contains nearly one frequency. 

It is important to recall that the definition of an IMF does not guarantee 
monochromaticity. This is illustrated with a deterministic example. The decom- 
position of a signal composed of a slow sinusoid with high frequency sinusoids 
superimposed at each crest and trough is shown in figure [H Despite the fact that 
this signal was constructed in a completely deterministic manner, its first two IMFs 
contain both high and low frequencies. Such IMFs are not suitable for the Hilbert 
transform and will not yield a well defined IF. |Wu and Huang| [2009] use a very 
similar example, developed independently from our example, to note that a decom- 
position may give rise to IMFs containing oscillations of drastically different scales. 
They refer to the creation of such IMFs as "mode mixing," and introduce the En- 
semble EMD (EEMD) to alleviate this issue. We will discuss the performance of 
EEMD on noisy data in section 4. 
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Fig. 3. IMFs of a deterministic signal. IMFs 1 and 2 contain both high and low frequencies, 
illustrating that monochromaticity is not guaranteed. 

3.2. Identifying the culprit 

The poor quality IF estimate from a noisy signal can be explained by the cre- 
ation of wide band IMFs. More precisely, the EMD decomposition of a noisy signal 
will generate some "noisy" IMFs. As explained below, such noisy IMFs are neither 
monochromatic signals nor pure noise; rather their Fourier transform is localized 
over a well defined frequency range. Consequently, such IMFs cannot contribute a 
well defined IF because noise is wide band by definition. Figure |4] shows the decom- 
position of the noisy example signal. We identify three categories of IMFs: 

(1) Noisy: IMFs 1-4 are wide band as they clearly contain noise. 

(2) Transition: IMFs 5-7 contain both signal and noise. These IMFs capture the 
"transition" from the noise captured in IMFs 1-4 and the monochromatic com- 
ponents extracted as IMFs 8-11. 



Signal 
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(3) Monochromatic: IMFs 8-11 are nearly monochromatic and yield meaningful 
IF contributions. 



IMF 1 IMF 2 IMF 3 IMF 4 

L|_^g^^^ U^J^^||||L||y Ljjj^ijjXjy^J 

pwwi^i^^pm ^^^^^H^n ppif|p*mjif|ipfii 





Fig. 4. IMFs of a noisy signal. IMFs 1-4 capture most of the noise, while IMFs 5-7 represent the 
transition from noise to signal, and IMFs 8-11 are nearly monochromatic. 



To demonstrate the effect of each type of IMF on the overall IF estimate, figure 
O highlights an example from each category. IMF 2 (left) is a noisy IMF; IMF 5 
(center) contains both signal and noise and is a transition IMF; IMF 9 (right) is 
nearly monochromatic. Spectrograms^ arc used to illustrate the frequency content 
that characterizes each IMF. The spectrogram of the noisy mode, IMF 2, shows 
that it is wide band and therefore yields an IF that is not physically meaningful. 
In contrast, the nearly monochromatic IMF 9 is seen to be narrow band and con- 
tributes a well defined IF. Finally, despite its signal content, transition IMF 5 is 
wide band and cannot contribute a clean IF. The inclusion of IF contributions from 
wide band IMFs pollutes the overall IF and is responsible for the poor result seen 
in figure [2j 



b Spectrograms are displayed as a log-scale color representation of the power spectral density calcu- 
lated using a Kaiser window of duration 0.1 seconds with 90% overlap. Red and blue correspond to 
higher and lower density, respectively, and the scale is uniform within a figure but not necessarily 
throughout the paper. 
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Fig. 5. Characteristic IMFs representing (a) noise, (b) transition from noise to signal, and (c) 
monochromatic components extracted from a noisy signal. 



Since the inclusion of certain IMFs results in a poor IF estimate, it is reasonable 
that some nonlinear thresholding process would yield better results. Specifically, 
discarding the IMFs identified as noise will provide a more meaningful IF estimate. 
In figure [H] the IF of the noisy version of the signal shown in figure QJ,(top) is 
now computed using only IMFs 5-11. It is important to note that IMF 5 is not dis- 
carded because as a transition IMF, it contains both signal and noise. We would like 
to ignore such an IMF since it will provide poor IF information derived partially 
from noise, but cannot discard its signal content. Therefore, it must be included 
and contaminates our overall estimate. The same is true of IMFs 6 and 7. Other 
thresholding methods may be utilized, including using only those IMFs with energy 
between specified thresholds Huang and Shen (2005)] . However, to our knowledge, 
there is not a clear cut method of thresholding that will produce a faithful IF es- 
timation. While the thresholded estimate in figure |5] is an improvement over the 
previous estimate shown in figure [3J the transition IMFs' contribution has left the 
IF mostly incoherent. The necessary inclusion of transition IMFs is therefore iden- 
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tificd as the main problem in estimating the IF in the presence of noise. 
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Fig. 6. Instantaneous frequency estimate using IMFs 5-11. The necessary inclusion of transition 
IMFs prevents a clean estimation. 



It is also reasonable that computing the IF from normalized IMFs 
Huang et al. (2009)] (see section 2) might yield cleaner results. However, 



| Huang et al. [2009J note that the normalized scheme encounters problems when an 
IMF contains noise and recommend computing the analytic signal with the standard 
Hilbcrt transform approach. Figure [7] shows the normalized version of the exam- 
ple IMFs from figure [5j We observe that we still have (from left to right) a noisy 
IMF, a transition IMF, and a monochromatic IMF. The IF contribution from each 
normalized IMF is shown, calculated by direct quadrature (middle) and normalized 
Hilbert transform (bottom). Just as in the standard unnormalized case, transition 
IMFs with corrupted IF contributions still exist and their necessary inclusion will 
prevent a clean IF estimate (not shown). 
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Fig. 7. Normalized IMFs of a noisy signal (top), IF contribution from direct quadrature (middle), 
and IF contribution from normalized Hilbert transform (bottom). 



4. Analysis of Noisy Decompositions 

With an understanding of how transition IMFs pollute the estimation of IF, we 
address the more fundamental question of why transition IMFs arc produced when 
EMD operates on a noisy signal. To begin, we note the work of Flandrin and 
Goncalves |2004j showing that EMD acts as a filter bank when decomposing pure 
noise, and add our observation that the boundaries of the frequency bands vary with 
time. We propose two mechanisms that lead to the creation of transition IMFs: 

(1) Spectral leak between frequency bands: frequency content of the underlying 
signal falls within a band treated as noise. 

(2) Phase alignment: the alignment of the signal with the lowest level of noise 
present in the band is controlled by the signal's phase. 

Spectral leak is mostly a nonstationary condition while the contribution of phase 
alignment is best seen in the stationary setting. 
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4.1. EMD decomposition of pure white noise 

Before returning to the decomposition of a noisy signal, EMD's performance on 
pure noise is analyzed. Figure [5] shows the spectrogram of a realization of white 
Gaussian noise (zero mean, standard deviation of 0.2). It is not surprising that the 
spectrogram shows nearly uniform power spectral density since, in principle, the 
density of such noise should be constant. This specific noise realization will be used 
in all experiments that follow in this section. 




0.1 0.2 Q.'i 0.4 0.5 0.6 0.7 0.8 0.9 

seconds 



Fig. 8. Spectrogram of white Gaussian noise used throughout this section. 



Flandrin and Goncalves |2004| reported that EMD acts as a filter bank when 
decomposing pure Gaussian noise. By selecting entire frequency bands as IMFs 
rather than a single frequency, the IMFs are by definition multicomponcnt. We 
observe a similar result and note that the boundaries of each band are not straight 
line cuts through the frequency axis, but instead vary as a function of time. This is 
clearly seen in the IMFs of the noise as their spectrograms (figure [5]) show that the 
borders of the frequency-bands do not resemble straight lines. The spectrograms 
also reveal that the IMFs provide a nearly dyadic decomposition of the spectrum 
shown in figure|SJ Since the noise is composed of realizations of random variables, we 
define its mean power spectral density M psc i(t) and associated standard deviation 
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SD psd {t) at a given time t as follows: 

F„/2 

M psd (t) = Y,k-P(k,t) 

k=0 
Fs/2 

M* sd (t) = k 2 ■ P(k,t) 

SD psd (t) = \/ M* sd (t) ~ (M psd (t)) 2 

where F s is the sampling rate and P(k,t) is the normalized power spectral density 
at frequency k and time t. The plots of the mean power spectral density with error 
bars representing one standard deviation show that the statistics of the IMFs vary 
with time (figure [TU]). Some frequency mixing between modes is also observed. 
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Fig. 9. Spectrogram of first six IMFs of white Gaussian noise, highlighting EMD's filter bank 
behavior. 




Fig. 10. Mean (with error bars representing one standard deviation) power spectral density of 
IMFs extracted from white Gaussian noise. Note the different scales on the frequency axis, clearly 
indicating an almost dyadic decomposition of the noise spectrum. 
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4.2. EMD decomposition of a signal corrupted by noise 
4.2.1. Spectral leak 

Kijcwski-Correa and Kareem [2007] attributed the poor quality of IF estimation in 
the presence of noise to the empirical nature of the algorithm, leading to a basis 
derived from the noise. They observed the mixing of the input signal over many 
IMFs, making it difficult to isolate the clean signal from the noise. We extend this 
explanation with our observations to explain the extraction of transition IMFs. The 
process is best understood by considering the noisy signal in the time-frequency 
plane. The algorithm is operating on projections in this plane, starting with the 
highest frequency band and adaptively moving down the frequency axis. These 
projections arc not completely orthogonal, and thus there is some frequency mixing 
in the modes. As EMD tiles down the time-frequency plane, it first extracts pure 
noise as it has not yet reached the frequency of the signal. While in the pure noise 
region, EMD behaves as a filter bank, as observed by Flandrin and Goncalvcs, 
extracting noise in an almost dyadic manner. 
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Fig. 11. A model of EMD's filter bank action shown in the time-frequency plane. Pieces of chirping 
signal are captured in noisy bands. The bands contributing to IMFs 1-4 are illustrated and the 
boundaries between the bands are idealized. 
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A model for this process in the time-frequency plane is provided in figure 1111 
The model shows a spectrogram of the noisy chirp sin(27r/t 2 ) +n(t), where / = 225 
Hz, t E [0, 1], and n{t) is the exact same realization of noise shown in figure [5] The 
boundaries between the bands are idealized, highlighting EMD's filter bank behav- 
ior. Noise is removed until a frequency present in the signal matches or exceeds that 
of the noise. The model demonstrates the situation where a portion of a nonsta- 
tionary signal leaks into an otherwise noisy band (IMF 3 in this example). In this 
case, the signal's frequency is high enough to be included in the IMF for only part 
of its duration. Still behaving in the noise regime, EMD extracts both signal and 
noise as it cannot distinguish which should be removed. Because of the variation 
in the boundaries of the identified frequency bands (seen in figures |H] and [TU1 not 
shown in the model), EMD will encounter such a band even when decomposing a 
noisy stationary signal. This is the general process that leads to the creation of a 
transition IMF, and will be seen explicitly in the following example. 
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Fig. 12. Decomposition of a noisy linear chirp. Note the signal content present in the transition 
IMFs 4-6. 
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To demonstrate the extraction of transition IMFs, we add the exact same re- 
alization of noise shown in figure H]to the linear chirp sin[27r(35t 2 + 10*)]. The 
decomposition of this noisy signal is shown in figure [T^] and spectrograms of the 
first six IMFs are shown in figure HU1 IMFs 1-3 show the filter bank action of EMD. 
The frequency of the signal is well below that of the noise, and EMD extracts the 
noise in a nearly dyadic fashion. We note the boundaries of the frequency bands 
vary with time, as expected. Once IMFs 1-3 have been removed, the next frequency 
band selected contains both noise and signal as can be seen in the spectrogram of 
IMF 4 (sec figure [T3")) . The noise remaining in the residual forces EMD to continue 
behaving as a filter bank. However, the highest frequency content of the chirp now 
falls within this band. In removing this band, a portion of the signal is pushed 
into IMF 4. In this respect, we observe the signal leaking into the noise. IMF 4 
will be composed of a mixture of noise and signal: noise for the temporal locations 
corresponding to those where the chirp's frequency is too low to be included; signal 
for the temporal locations where the chirp's frequency reaches into the noise band. 
Thus a transition IMF is produced, containing signal that has been prematurely 
removed. Because this portion of signal no longer remains in the residual, it cannot 
be accounted for in the next IMF. Therefore, subsequent IMFs will be damaged as 
each is derived from the remaining incomplete residual. This process continues for 
IMFs 5 and 6, and the portions of the chirp that leak into the empirically defined 
bands arc removed with the noise in a manner similar to IMF 4. We see the forma- 
tion of transition IMFs is consistent with the model presented in figure [TT] 

Spectral leak is similar to the mode mixing observed by |Wu and Tl uang [2009 . 
To resolve the mode mixing issue, they introduce EEMD to produce IMFs that 
represent only one scale of oscillation. EEMD cleverly uses noise perturbations to 
force the algorithm to explore all frequencies while not adding too much noise so as 
to push the algorithm into the spectral leak regime. Noise is added to the original 
signal and a standard EMD decomposition is performed. This is repeated with dif- 
ferent noise realizations for a fixed number of times. The resulting IMFs from each 
run are then averaged, producing an "ensemble" result. [Wu and Huang| demonstrate 
that this is an effective way of eliminating mode mixing even in signals that contain 
a mild amount of noise. Our analysis continues this line of thought by examining 
decompositions of signals with noise of higher levels, as is often encountered in real 
world data. It is this noise that causes spectral leak between IMFs and presents a 
different problem than that solved by EEMD. Adding more noise to the already 
contaminated signal will not produce cleaner results. The realization of the original 
contaminating noise remains the same over all trials and thus cannot be eliminated 
through averaging. For these reasons, our analysis is focused on the standard EMD 
decomposition of noisy signals. 
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Fig. 13. Spectrograms of the decomposition of a noisy linear chirp. Transition IMFs 4-6 display 
the spectral leak of signal into noise. Note the change in scale on the frequency axis. 



4.2.2. Phase alignment 

The spectral explanation is not the entire story; the phase of the underlying signal 
also plays a role in the creation of transition IMFs. We have seen that the bound- 
aries of the frequency bands of noisy IMFs dip lower in some locations and extend 
higher in others (figure O ■ We also have observed that the standard deviation of 
a band's frequency varies with time (figure ITU]) . When the energy of the noise is 
high, the energy of the signal cannot be felt by the algorithm. In this way, we think 
of the noise as insulating the signal from extraction. However, at a given time, if 
the energy of the noise is small, EMD may include part of the underlying signal 
in the current IMF as well. At these time locations, the noise does not insulate 
the signal from extraction. Thus signal leaks into an otherwise noisy IMF at the 
locations where the standard deviation is small. This process is illustrated by the 
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model seen in figure 1141 showing a noisy signal in the time-frequency plane. From 
0.5 to 0.6 seconds there is a clear dip in the energy of the noise. In this region, the 
energy of the signal is exposed and will be extracted into the next IMF. Outside 
of this region, the energy of the noise is high and insulates the underlying signal. 
Here, only the noise will be extracted and the signal will remain untouched. The 
locations at which signal is extracted into an otherwise noisy IMF will be shown to 
be phase dependent. 




Fig. 14. A model of a noisy signal in the time-frequency plane. Signal will be extracted in the 
region corresponding to 0.5-0.6 seconds. Here the energy of the noise is too low to insulate the 
signal from extraction. Outside of this region, only the energy of the noise will be extracted. 

Consider two signals with identical spectral content, differing only by a constant 
phase factor and contaminated with the same noise realization. For simplicity, we 
consider two stationary signals. Using a stationary example will limit the effect of 
spectral leak, as unlike the chirp used in the previous nonstationary case, a signal 
with one frequency should not have energy spread over many IMFs. Let / = 75 
Hz and t G [0, 1] seconds. We examine X\ = sin(27r/i) and a phase-shifted copy 
x% = sin(27r/t + .9p), where p — j is the period of x\. Because x\ and X2 have the 
same frequency content, we expect that when contaminated with the same noise 
realization, EMD should produce very similar results. Figure [T5l shows that the first 
transition IMFs for each noisy signal contain signal in different locations. Examining 
the residual from which each transition IMF was extracted lends an explanation. 
The smallest standard deviation in each residual occurs near 0.7 seconds and 0.4 
seconds for x\ and X2 respectively and is highlighted in red. These time locations 
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correspond exactly with the location of signal content in each transition IMF. At 
these locations, the level of the noise is too small to insulate the signal from ex- 
traction into the current IMF. This process demonstrates that the extraction of 
transition IMFs is also phase dependent. 




Fig. 15. Two stationary signals with identical spectral content differing only by a phase shift. From 
top to bottom: the clean signal, spectrograms of the noisy residual from which the first transition 
IMFs are extracted, mean power spectral density (PSD) of the residual with error bars representing 
one standard deviation, and the first transition IMFs. The mean PSD sections highlighted in red 
(around 0.7s for x\ and 0.4s for X2) correspond to those with the smallest standard deviations and 
is where signal leaks into the otherwise noisy IMFs. 
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In the above example, the first IMF contains pure noise for both signals. Be- 
cause the exact same noise realization was used to contaminate both signals, one 
might expect that the first IMF, and thus the first residual, for each signal would 
be identical. However, as noted above and seen in figure [T3J the statistics of the 
residuals are different, showing dips in the energy of the noise at different locations. 
For a more complete understanding of the demonstrated phase dependence, we con- 
sider how the phase of a signal interacts with noise. The interference between the 
sinusoidal function Xi(t) = acos(ujt + Pi) (i = 1 or 2) and a realization n{t) of the 
white noise can be described by the following simple model. We consider n(t) to 
be a realization of a white noise process sampled at a finite number of samples N. 
We can decompose n(t) using a finite Fourier transform Brillinger (2001)] and the 
Fourier series expansion can be written as follows: 

N-i t 
n(t) =Y Pk cos(27rfc— + <p k ) 

k=Q 

where the pk > and (pk arc defined by 

a k =pkCOS(pk, b k = -pksinipk, and a = 2p cos(p , 

with 

2 N ~ 1 t 2 W_1 t 

ak = TV 51 "(0c!os(27rfc — ) (k = Q, . . .) and b k = ^ n(t) sm(2irk— ) (k = 1, ...). 

i=0 t=0 

We now contaminate the signal Xi(t) by adding the noise realization n(t) to Xi(t), 

n-i t 

Xi(t)+n(t) =acos(ut + Pi)+ ^ p k cos(2?rA;-— + tp k ) (t = 0,1, . . . ,N — 1). 

k=0 

Because the noise is white, we expect the realization of the noise to have a uniform 
distribution of the energy in the Fourier domain. In other words, we expect that all 
Pk have similar amplitudes. 

We now examine under what circumstances the noise will interfere with the 
signal. First, we assume that the signal amplitude is about the same as the noise 
level, (a « p ko ). Second, we consider the frequency index of the noise that matches 
the frequency of the signal, kg such that uj 27t/co- At this frequency the noise will 
interfere with the signal. Formally, we can consider the interaction of the two cosine 
function, 

a cos (uj^ + ft) + p k „ cos ^27rfc -^ + <Pk„ 

+ 27rfco t 6i + ip ko \ (ui - 27rfco t Pi - <Pk 



2pk„ cos , 
If lu ss 2Trko, then the function 

27rfc t_ t Pi - ipk a 
2 N 
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slowly modulates the other cosine function, 



pk cos 



( 



2 N 2 



) 



which still oscillates at the frequency uj since (lu + 27rfco)/2 w. The overall ampli- 
tude of the slowly varying envelope cos((a; — 27r/co)/2 i/iV + — (fk )/2) clearly 
depends on the phase difference (/3j — tpf. )/2, as is shown in figure [T"5l 

We conclude that the exact amount of cancellation created by the interference 
between the original signal Xi(t) and the noise realization n(t) depends on the phase 
of the signal Xi(t). We note that this analysis is concerned with one realization of the 
noise, and is not in contradiction with the fact that the noise statistical properties 
are translation invariant, since the noise is considered to be stationary. 



5. EMD Decomposition of Synthetic Seismic Data 

Having demonstrated both the effect and mechanism of noise corruption on simple 
synthetic examples, we turn our attention to a synthetic seismic signal which will 
serve as a model for real world data. The signal was constructed using elemen- 
tary chirplet wave packets. Such chirplet packets were proposed bylBardainn c~ef al. I 
|2006j to decompose scismograms. Details of the construction are given in Appendix 
A. Figure [T6k shows the clean signal that will be considered along with the esti- 
mate of its instantaneous frequencjEI. In the absence of noise we observe that the 
decomposition of the signal yields a physically meaningful IF (figure 116b ) . 



c This synthetic seismic waveform is the result of the superposition of several signals, each with 
different frequency and amplitude functions. Therefore, the waveform is a multicomponcnt signal 
and its analytic IF is not well defined. The IF must be computed numerically (as the weighted 
sum of the IF from each of its IMFs) as shown in figure [T6b . 
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Fig. 16. Clean seismic signal from which a physically meaningful IF is calculated. 
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Fig. 17. Noisy seismic signal (SNR = 24dB) from which a physically meaningful IF cannot be 
calculated. 
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To investigate the effect of noise, the same signal is contaminated with additive 
white Gaussian noise and we consider an SNR of 24dB. The noisy signal is shown 
in figure 117b , and it is clear that a meaningful IF was not produced (figure 117b ) . 
Examining the IMFs of the noisy signal shows that IMF 1 contains noise and IMF 
2 represents the transition from noise to signal. It is noted that 91.8% of the sig- 
nal's total energy is captured in this transition IMF. Eleven IMFs were produced 
and figure IT51 shows the first five, capturing 98.6% of the energy. It is clear that to 
produce a meaningful instantaneous frequency, IMF 1 must be discarded. IMF 2 
must be included as it contains almost all of the energy, but will be problematic as 
it also contains noise. Recomputing the IF (not shown) using all but the first IMF 
fails to produce a meaningful IF estimate due to the noise present in IMF 2. 



IMF 1 



IMF 1 



5 

-5 

2 

-2 

0.5 



-0.5 





0.2 0.4 0.6 

IMF 2 


0.8 


■ m — 




0.2 0.4 0.6 

IMF 3 


0.8 






0.2 0.4 0.6 

IMF 4 


0.8 










0.2 0.4 0.6 

IMF 5 


0.8 




2000 
g 1000 


2000 



1 1 1 




0.2 OA 0.6 

IMF 2 

' " ' • •' — 


0.8 






0.2 0.4 0.6 

IMF 3 


0.8 



0.2 0.4 0.6 0.1 

seconds 




(a) IMFs 



(b) Spectrograms of IMFs 



Fig. 18. First five IMFs with spectrograms from the decomposition of the noisy seismic signal. 
91.8% of the total energy is captured in transition IMF 2. IMFs 3-5 are damaged by the extraction 
of signal into IMF 2. 
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The seismic signal is clearly nonstationary. We therefore expect that the tran- 
sition IMF was formed due to spectral leak. The IMFs in figure [T5] indicate that 
the decomposition indeed followed the process presented in the model for spectral 
leak ( figure ITT]) . IMF 1 is pure noise, extracted by EMD operating in the filter bank 
regime. The spectrogram of IMF 2 shows that EMD continued down the frequency 
axis in a somewhat dyadic fashion. In principle, IMF 2 would have contained only 
pure noise, but the frequency content of the signal leaked into the bottom of this 
frequency band. The spectrograms of IMFs 3-5 show that the extraction of signal 
into the transition IMF damaged all subsequent IMFs. 

Finally, there is also evidence of phase dependence. Let the original signal be 
denoted by x, and consider x\ and X2, two phase-shifted copies of x with identical 
spectral content. Phase shift is accomplished by adding a constant c to the argument 
of the sine in the wave packet Wk{t) (see Appendix A). The values used for c are 
0.9-7T and 0.37T for X\ and £2, respectively. Figure [T9l shows the transition from noise 
to signal is captured in IMF 2 for x and x\. Although subtle, these IMFs contain 
signal at different locations (most easily seen at 0.6 seconds). A more obvious effect 
is seen in the decomposition of X2 , where the transition begins in IMF 1 instead of 
IMF 2. 
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Fig. 19. First two IMFs of noisy seismic signals differing only by a phase factor. IMF 2 is the 
transition IMF for x and xi, while the transition begins in IMF 1 for X2- The transition IMFs 
for x and xi contain signal content in slightly different locations, most notable at time t = 0.6 
seconds. 
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6. Conclusions 

All data analysis tools are susceptible to noise corruption; EMD is not an excep- 
tion. Despite this reality, EMD has emerged as an effective tool for nonstationary 
data analysis. Wavelet decompositions, which suffer from similar corruption in the 
presence of noise, are accompanied by rich theory from which this noise corruption 
may be studied and understood. A complete theoretical framework for EMD has 
yet to emerge. Therefore, EMD is best understood through experiments to discover 
and test its limits. EMD is an effective tool for estimating the IF of a clean signal 
but provides a poor estimate in the presence of noise. When decomposing a noisy 
signal, "transition" IMFs are extracted, capturing both noise and signal in the same 
mode. Such IMFs are problematic as their noise pollutes the IF calculation yet their 
signal content cannot be ignored. We have demonstrated both the existence of and 
mechanism by which transition IMFs are created. Specifically, transition IMFs arise 
from spectral leak between modes and EMD's filter bank behavior in the presence 
of noise. In addition, the manner in which signal leaks into an otherwise noisy IMF 
has been shown to be phase dependent. Given this understanding, there is an op- 
portunity to more faithfully estimate instantaneous frequency in the presence of 
noise. In doing so, care must be taken to treat transition IMFs in a manner that 
preserves any meaningful physical information, as this is an idea at the core of the 
development of EMD. 



Appendix A. Seismic Waveform 



The synthetic seismic waveform, fit), used in section 5 is based on the work of 
IBardainne et al.\ |2006j and is constructed as follows: 

Let /(f) = Et=i a kw k ((f - t k )/d k ) , t e [0, 1] 

• Wave packet u>k{t) = g(t) sin [2n(f k + Pkt qk )t] 

• Envelope g(t) = two Gaussians smoothly glued: 

2" 



exp 
1 

exp 



£c fc (l-i fc ) 



/ c k + (l-c k )l k -t \ ' 
\i(l-c k )(.l-l k )J 



if < f < c fc (l - l k ) 

if c k (l-l k ) < t < c k + (1 - c k )l k 

if c fc + (l-c fc )Zfc < f < 1 



• [fk,Pk, 9fe) control the frequency of the wave packet 

• (cfe, lk) control the boundary between the attack and the silencing of the wave 
packet 



The parameter values used are shown in table [T] below. 
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Table 1. Parameters used for construction of seismic 
waveform. 
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